462 research outputs found

    Testing the additional predictive value of high-dimensional molecular data

    While high-dimensional molecular data such as microarray gene expression data have been used for disease outcome prediction or diagnosis purposes for about ten years in biomedical research, the question of the additional predictive value of such data given that classical predictors are already available has long been under-considered in the bioinformatics literature. We suggest an intuitive permutation-based testing procedure for assessing the additional predictive value of high-dimensional molecular data. Our method combines two well-known statistical tools: logistic regression and boosting regression. We give clear advice for the choice of the only method parameter (the number of boosting iterations). In simulations, our novel approach is found to have very good power in different settings, e.g. few strong predictors or many weak predictors. For illustrative purposes, it is applied to two publicly available cancer data sets. Our simple and computationally efficient approach can be used to globally assess the additional predictive power of a large number of candidate predictors given that a few clinical covariates or a known prognostic index are already available.
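
The permutation idea in the abstract above can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: it assumes the clinical covariates are fitted first by logistic regression, that their linear predictor is kept as a fixed offset, and that componentwise boosting is then run on the molecular data. The function names, the step size `nu`, and the choice of the in-sample binomial log-likelihood as test statistic are all assumptions made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(Z, y, n_iter=25):
    """Newton-Raphson logistic regression on clinical covariates Z.
    Returns the fitted linear predictor, used later as a fixed offset."""
    Zc = np.column_stack([np.ones(len(y)), Z])
    beta = np.zeros(Zc.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Zc @ beta)
        w = p * (1 - p)
        # Tiny ridge term keeps the Hessian invertible (an assumption of this sketch).
        H = Zc.T @ (Zc * w[:, None]) + 1e-8 * np.eye(Zc.shape[1])
        beta += np.linalg.solve(H, Zc.T @ (y - p))
    return Zc @ beta

def boosted_loglik(X, y, offset, mstop=50, nu=0.1):
    """Componentwise boosting on molecular data X, starting from the offset.
    Returns the binomial log-likelihood after mstop iterations.
    Assumes no column of X is constant."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    f = offset.copy()
    for _ in range(mstop):
        r = y - sigmoid(f)                # negative gradient of the logistic loss
        coefs = Xs.T @ r / len(y)         # least-squares slope per standardized column
        j = np.argmax(np.abs(coefs))      # single best-fitting predictor
        f += nu * coefs[j] * Xs[:, j]
    p = np.clip(sigmoid(f), 1e-12, 1 - 1e-12)
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def permutation_test(Z, X, y, n_perm=200, mstop=50, seed=0):
    """Permute the rows of X to break its link with (Z, y), keeping Z and y paired."""
    rng = np.random.default_rng(seed)
    offset = fit_logistic(Z, y)
    observed = boosted_loglik(X, y, offset, mstop)
    null = np.array([
        boosted_loglik(X[rng.permutation(len(y))], y, offset, mstop)
        for _ in range(n_perm)
    ])
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value
```

Here `Z` is an n-by-q array of clinical covariates, `X` an n-by-p array of molecular predictors, and `y` a 0/1 response vector. A small p-value suggests that the molecular data improve the fit beyond what the clinical model already explains.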

    ProbCD: enrichment analysis accounting for categorization uncertainty

    As in many other areas of science, systems biology makes extensive use of statistical association and significance estimates in contingency tables, a type of categorical data analysis known in this field as enrichment (also over-representation or enhancement) analysis. In spite of efforts to create probabilistic annotations, especially in the Gene Ontology context, or to deal with uncertainty in high throughput-based datasets, current enrichment methods largely ignore this probabilistic information since they are mainly based on variants of the Fisher Exact Test. We developed an open-source R package to deal with probabilistic categorical data analysis, ProbCD, that does not require a static contingency table. The contingency table for the enrichment problem is built using the expectation of a Bernoulli Scheme stochastic process given the categorization probabilities. An on-line interface was created to allow usage by non-programmers and is available at: http://xerad.systemsbiology.net/ProbCD/. We present an analysis framework and software tools to address the issue of uncertainty in categorical data analysis. In particular, concerning the enrichment analysis, ProbCD can accommodate: (i) the stochastic nature of the high-throughput experimental techniques and (ii) probabilistic gene annotation.
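
To make the idea of an expected contingency table concrete, here is a minimal sketch. It is not ProbCD's API (ProbCD is an R package); the function name and the assumption that, for each gene, list membership and category membership are independent Bernoulli events are introduced only for this illustration.

```python
import numpy as np

def expected_contingency_table(p_selected, p_annotated):
    """Expected 2x2 table when each gene i is in the selected list with
    probability p_selected[i] and in the category with probability
    p_annotated[i] (assumed independent for each gene).

    Rows: selected / not selected. Columns: in category / not in category."""
    p = np.asarray(p_selected, dtype=float)
    q = np.asarray(p_annotated, dtype=float)
    return np.array([
        [np.sum(p * q),       np.sum(p * (1 - q))],
        [np.sum((1 - p) * q), np.sum((1 - p) * (1 - q))],
    ])
```

When all probabilities are 0 or 1, the table reduces to the ordinary counts a static contingency table would hold; with fractional probabilities the cells become non-integer expectations that still sum to the number of genes.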

    Literature-aided interpretation of gene expression data with the weighted global test

    Most methods for the interpretation of gene expression profiling experiments rely on the categorization of genes, as provided by the Gene Ontology (GO) and pathway databases. Due to the manual curation process, such databases are never up-to-date and tend to be limited in focus and coverage. Automated literature mining tools provide an attractive, alternative approach. We review how they can be employed for the interpretation of gene expression profiling experiments. We illustrate that their comprehensive scope aids the interpretation of data from domains poorly covered by GO or alternative databases, and allows for the linking of gene expression with diseases, drugs, tissues and other types of concepts. A framework for proper statistical evaluation of the associations between gene expression values and literature concepts was lacking and is now implemented in a weighted extension of the global test. The weights are the literature association scores and reflect the importance of a gene for the concept of interest. In a direct comparison with classical GO-based gene sets, we show that use of literature-based associations results in the identification of much more specific GO categories. We demonstrate the possibilities for linking of gene expression data to patient survival in breast cancer and the action and metabolism of drugs. Coupling with online literature mining tools ensures transparency and allows further study of the identified associations. Literature mining tools are therefore powerful additions to the toolbox for the interpretation of high-throughput genomics data.

    Outcome-related metabolomic patterns from 1H/31P NMR after mild hypothermia treatments of oxygen–glucose deprivation in a neonatal brain slice model of asphyxia

    Human clinical trials using 72 hours of mild hypothermia (32°C–34°C) after neonatal asphyxia have found substantially improved neurologic outcomes. As temperature changes differently modulate numerous metabolite fluxes and concentrations, we hypothesized that 1H/31P nuclear magnetic resonance (NMR) spectroscopy of intracellular metabolites can distinguish different insults, treatments, and recovery stages. Three groups of superfused neonatal rat brain slices underwent 45 minutes of oxygen–glucose deprivation (OGD) and were then: treated for 3 hours with mild hypothermia (32°C) that began with OGD; similarly treated with hypothermia after a 15-minute delay; or not treated (normothermic control group, 37°C). Hypothermia was followed by 3 hours of normothermic recovery. Slices collected at different predetermined times were processed, respectively, for 14.1 Tesla NMR analysis, enzyme-linked immunosorbent assay (ELISA) cell-death quantification, and superoxide production. Forty-nine NMR-observable metabolites underwent a multivariate analysis. Separate clusters in the scores plots were found for treatment and outcome groups. Final ATP (adenosine triphosphate) levels, severely decreased at normothermia, were restored equally by immediate and delayed hypothermia. Cell death was decreased by immediate hypothermia, but was equally and substantially greater with normothermia and delayed hypothermia. Potentially important biomarkers in the 1H spectra included PCr-1H (phosphocreatine in the 1H spectrum), ATP-1H (adenosine triphosphate in the 1H spectrum), and ADP-1H (adenosine diphosphate in the 1H spectrum). The findings suggest a potential role for metabolomic monitoring during therapeutic hypothermia.

    Similar gene expression profiles of sporadic, PGL2-, and SDHD-linked paragangliomas suggest a common pathway to tumorigenesis

    BACKGROUND: Paragangliomas of the head and neck are highly vascular and usually clinically benign tumors arising in the paraganglia of the autonomic nervous system. A significant number of cases (10-50%) are proven to be familial. Multiple genes encoding subunits of the mitochondrial succinate-dehydrogenase (SDH) complex are associated with hereditary paraganglioma: SDHB, SDHC and SDHD. Furthermore, a hereditary paraganglioma family has been identified with linkage to the PGL2 locus on 11q13. No SDH genes are known to be located in the 11q13 region, and the exact gene defect has not yet been identified in this family. METHODS: We have performed an RNA expression microarray study in sporadic, SDHD- and PGL2-linked head and neck paragangliomas in order to identify potential differences in gene expression leading to tumorigenesis in these genetically defined paraganglioma subgroups. We have focused our analysis on pathways and functional gene-groups that are known to be associated with SDH function and paraganglioma tumorigenesis, i.e. metabolism, hypoxia, and angiogenesis related pathways. We also evaluated gene clusters of interest on chromosome 11 (i.e. the PGL2 locus on 11q13 and the imprinted region 11p15). RESULTS: We found remarkable similarity in overall gene expression profiles of SDHD-linked, PGL2-linked and sporadic paraganglioma. The supervised analysis on pathways implicated in PGL tumor formation also did not reveal significant differences in gene expression between these paraganglioma subgroups. Moreover, we were not able to detect differences in gene-expression of chromosome 11 regions of interest (i.e. 11q23, 11q13, 11p15). CONCLUSION: The similarity in gene-expression profiles suggests that PGL2, like SDHD, is involved in the functionality of the SDH complex, and that tumor formation in these subgroups involves the same pathways as in SDH-linked paragangliomas. We were not able to clarify the exact identity of PGL2 on 11q13. The lack of differential gene-expression of chromosome 11 genes might indicate that chromosome 11 loss, as demonstrated in SDHD-linked paragangliomas, is an important feature in the formation of paragangliomas regardless of their genetic background.

    Investigating the validity of the DN4 in a consecutive population of patients with chronic pain

    Neuropathic pain is clinically described as pain caused by a lesion or disease of the somatosensory nervous system. The aim of this study was to assess the validity of the Dutch version of the DN4, in a cross-sectional multicentre design, as a screening tool for detecting a neuropathic pain component in a large consecutive population of patients with chronic pain that was not pre-stratified on the basis of the target outcome. Patients’ pain was classified by two independent (pain-)physicians as the gold standard. The analysis was initially performed on the outcomes of those patients (n = 228 out of 291) in whom both physicians agreed in their pain classification. Compared to the gold standard, the DN4 had a sensitivity of 75% and a specificity of 76%. The DN4 symptoms (seven interview items) alone resulted in a sensitivity of 70% and a specificity of 67%. For the DN4 signs (three examination items), these were 75% and 75%, respectively. In conclusion, because the DN4 appears to help only moderately in identifying a neuropathic pain component in a consecutive population of patients with chronic pain, a comprehensive (physical) examination by the physician is still obligatory.
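
For readers unfamiliar with the two measures reported above, the following minimal sketch shows how sensitivity and specificity are computed from a screening tool's results against a gold standard. The function name is hypothetical, and the abstract does not give the underlying counts, so any counts passed to it here would be made-up illustrations rather than study data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = share of gold-standard positives the screen detects.
    Specificity = share of gold-standard negatives the screen clears.

    tp/fn: gold-standard positives screened positive/negative.
    tn/fp: gold-standard negatives screened negative/positive."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity
```

With purely hypothetical counts, `sensitivity_specificity(tp=8, fn=2, tn=6, fp=4)` gives a sensitivity of 0.8 and a specificity of 0.6.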

    Heading Down the Wrong Pathway: on the Influence of Correlation within Gene Sets

    Background: Analysis of microarray experiments often involves testing for the overrepresentation of pre-defined sets of genes among lists of genes deemed individually significant. Most popular gene set testing methods assume the independence of genes within each set, an assumption that is seriously violated, as extensive correlation between genes is a well-documented phenomenon. Results: We conducted a meta-analysis of over 200 datasets from the Gene Expression Omnibus in order to demonstrate the practical impact of strong gene correlation patterns that are highly consistent across experiments. We show that a common independence assumption-based gene set testing procedure produces very high false positive rates when applied to data sets for which treatment groups have been randomized, and that gene sets with high internal correlation are more likely to be declared significant. A reanalysis of the same datasets using an array resampling approach properly controls false positive rates, leading to more parsimonious and high-confidence gene set findings, which should facilitate pathway-based interpretation of the microarray data. Conclusions: These findings call into question many of the gene set testing results in the literature and argue strongly for the adoption of resampling based gene set testing criteria in the peer reviewed biomedical literature.
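
One common form of array resampling is to permute the sample (array) labels rather than the genes, which leaves the gene-gene correlation structure intact under the null. The sketch below illustrates that idea only; it is not the procedure used in the study above. The function names and the choice of the mean absolute Welch t-statistic as the gene set statistic are assumptions made for this example.

```python
import numpy as np

def t_stats(expr, labels):
    """Welch t-statistic per gene. expr is genes x samples; labels is a 0/1 array."""
    a, b = expr[:, labels == 1], expr[:, labels == 0]
    se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1] +
                 b.var(axis=1, ddof=1) / b.shape[1])
    return (a.mean(axis=1) - b.mean(axis=1)) / se

def array_resampling_pvalue(expr, labels, gene_set, n_perm=500, seed=0):
    """Permutation p-value for one gene set, obtained by shuffling sample labels.
    Whole arrays are relabelled, so correlations between genes are preserved."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    observed = np.mean(np.abs(t_stats(expr, labels)[gene_set]))
    null = np.array([
        np.mean(np.abs(t_stats(expr, rng.permutation(labels))[gene_set]))
        for _ in range(n_perm)
    ])
    return (1 + np.sum(null >= observed)) / (1 + n_perm)
```

Here `gene_set` is a list or array of row indices into `expr`. Because each permutation relabels entire arrays, a set of highly correlated genes produces a correspondingly wide null distribution, which is the property an independence-based test lacks.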

    Integrated analysis of DNA copy number and gene expression microarray data using gene sets

    Background: Genes that play an important role in tumorigenesis are expected to show association between DNA copy number and RNA expression. Optimal power to find such associations can only be achieved if analysing copy number and gene expression jointly. Furthermore, some copy number changes extend over larger chromosomal regions affecting the expression levels of multiple resident genes.